4 ◾ Bioinformatics
C+T, and C). The order of the nucleotides (A, C, G, and T) in the DNA sequence can then
be determined from the pattern of bands on the gel.
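The band-reading logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four lanes are taken to be G, A+G, C+T, and C (the usual chemical-cleavage lanes; only the end of the list is visible in this section), and the function name read_cleavage_gel and the toy band pattern are hypothetical.

```python
# Minimal sketch: call bases from the lanes in which a band appears.
# Assumption: lanes are named "G", "A+G", "C+T", and "C".

LANE_PATTERN_TO_BASE = {
    frozenset({"G", "A+G"}): "G",   # band in both purine lanes -> G
    frozenset({"A+G"}): "A",        # band only in A+G -> A
    frozenset({"C", "C+T"}): "C",   # band in both pyrimidine lanes -> C
    frozenset({"C+T"}): "T",        # band only in C+T -> T
}


def read_cleavage_gel(bands):
    """Read a sequence from a gel.

    bands maps a fragment length to the set of lanes showing a band
    at that length. Shorter fragments are read first.
    """
    sequence = []
    for length in sorted(bands):
        sequence.append(LANE_PATTERN_TO_BASE[frozenset(bands[length])])
    return "".join(sequence)


# Hypothetical toy gel with four bands.
toy_gel = {
    1: {"A+G"},
    2: {"C", "C+T"},
    3: {"G", "A+G"},
    4: {"C+T"},
}
print(read_cleavage_gel(toy_gel))  # prints ACGT
```

A band pattern that matches none of the four combinations raises a KeyError, which in this sketch stands in for an unreadable position on the gel.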
The steps of the Sanger sequencing method, on the other hand, are similar to those of the
polymerase chain reaction (PCR): denaturation, primer annealing, and complementary
strand synthesis by a polymerase. However, in Sanger sequencing, the sample DNA is
divided into four reaction tubes labeled ddATP, ddGTP, ddCTP, and ddTTP. All four
deoxynucleotide triphosphates (dATP, dGTP, dCTP, and dTTP) are added to each tube, as in
PCR, but each tube also receives the one radio-labeled dideoxynucleotide triphosphate
(ddATP, ddGTP, ddCTP, or ddTTP) named on its label. Because a dideoxynucleotide lacks the
3'-hydroxyl group needed to attach the next nucleotide, its incorporation terminates DNA
synthesis at a position occupied by that known nucleotide. The synthesis termination
results in DNA fragments of varying lengths, each ending with the labeled ddNTP.
Those fragments are then separated by size using gel electrophoresis on a denaturing
polyacrylamide-urea gel, with each of the four reactions running in a separate lane
labeled A, T, G, and C. The fragments separate by length because smaller fragments move
faster through the gel. The DNA bands are then visualized by autoradiography, and the
order of the nucleotide bases in the DNA sequence can be read directly from the X-ray
film or the gel image.
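The chain-termination idea can be illustrated with a small simulation. The sketch below is a simplification, not a model of the real chemistry: it ignores the primer, assumes that a terminated fragment is obtained at every position of the newly synthesized strand, and writes the template 3' to 5' so that the new strand reads 5' to 3' from left to right. The function names sanger_lanes and read_gel are hypothetical.

```python
# Minimal sketch of Sanger sequencing: four lanes of terminated
# fragments, then reading the bands from shortest to longest.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_lanes(template):
    """Return the fragment lengths observed in each of the four lanes.

    template is assumed to be written 3'->5'. In the ddATP tube,
    synthesis stops wherever an A is incorporated, and likewise for
    the other three tubes.
    """
    synthesized = "".join(COMPLEMENT[base] for base in template)
    lanes = {"A": [], "T": [], "G": [], "C": []}
    for length, base in enumerate(synthesized, start=1):
        # A fragment of this length ends with the dideoxy form of `base`.
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the sequence from the gel, shortest fragment first."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = sanger_lanes("TACGGTCA")
print(lanes)            # {'A': [1, 6], 'T': [2, 8], 'G': [3, 7], 'C': [4, 5]}
print(read_gel(lanes))  # ATGCCAGT
```

The sequence read from the gel is that of the synthesized strand, which is the complement of the template.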
1.2.2 Next-Generation Sequencing
Next-generation sequencing (NGS) was introduced a few decades after first-generation
sequencing. Unlike first-generation sequencing, NGS produces a massive number of
sequences from a single sample in a short period of time and at lower cost, and it can
process multiple samples simultaneously. Millions to billions of DNA nucleotides are
sequenced in parallel, yielding a very large volume of sequence data. With NGS, millions
of prokaryotic, eukaryotic, and viral genomes have been sequenced. Rather than chain
termination, NGS uses a library of fragmented DNA to determine the order of the
nucleotides in a targeted sequence. NGS is used in many applications, including
sequencing of the whole genome, the whole transcriptome, targeted genes or transcripts,
and the genomic regions where epigenetic modifications or protein interactions take
place. Hence, NGS can be used for genome assembly, mutation or variant discovery, gene
expression studies, epigenetics, and metagenomics. Those applications are discussed in
detail in the next chapters.
After DNA or RNA sample collection, the first step of the NGS process is library
preparation, in which the sequencing libraries are constructed for the DNA or RNA sample
of interest. RNA is converted into complementary DNA (cDNA) before library preparation.
Library preparation involves breaking the DNA into small fragments (fragmentation step)
using sonication or enzymes. The size of the fragments can be adjusted to a specific
length or range. Fragmentation is followed by repairing or blunting the ends of the
fragments that have unpaired or overhanging nucleotides (end-repair step). The next step
is the ligation of adaptors to the ends of the DNA fragments. The adaptors are
artificially synthesized sequences that include certain parts to serve specific
purposes. The free end of a ligated adaptor is made of an anchoring sequence that can
attach to the surface of the flow cell slide where sequencing takes place. The adaptor
also includes universal primers